7 research outputs found

    Generating 3D faces using Convolutional Mesh Autoencoders

    Full text link
    Learned 3D representations of human faces are useful for computer vision problems such as 3D face tracking and reconstruction from images, as well as graphics applications such as character generation and animation. Traditional models learn a latent representation of a face using linear subspaces or higher-order tensor generalizations. Due to this linearity, they cannot capture extreme deformations and non-linear expressions. To address this, we introduce a versatile model that learns a non-linear representation of a face using spectral convolutions on a mesh surface. We introduce mesh sampling operations that enable a hierarchical mesh representation that captures non-linear variations in shape and expression at multiple scales within the model. In a variational setting, our model samples diverse realistic 3D faces from a multivariate Gaussian distribution. Our training data consists of 20,466 meshes of extreme expressions captured over 12 different subjects. Despite limited training data, our trained model outperforms state-of-the-art face models with 50% lower reconstruction error, while using 75% fewer parameters. We also show that replacing the expression space of an existing state-of-the-art face model with our autoencoder achieves a lower reconstruction error. Our data, model and code are available at http://github.com/anuragranj/com
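    The abstract above describes spectral convolutions on a mesh with hierarchical sampling. The following is a minimal illustrative sketch of a Chebyshev-polynomial spectral convolution layer of that general kind, not the authors' released code; the Laplacian, down-sampling matrix, layer sizes, and names are assumptions for illustration only.

```python
# Minimal sketch of a Chebyshev spectral convolution on a mesh, in the spirit of
# the model described above (illustrative, not the authors' implementation).
# The mesh topology enters through a rescaled graph Laplacian `lap`; the
# down-sampling matrix `down` (assumed precomputed) gives the coarser level of
# the hierarchy.
import torch
import torch.nn as nn


class ChebConv(nn.Module):
    """Spectral convolution using Chebyshev polynomials T_k(L) of the Laplacian."""

    def __init__(self, in_ch, out_ch, K):
        super().__init__()
        self.K = K
        self.weight = nn.Parameter(torch.randn(K, in_ch, out_ch) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_ch))

    def forward(self, x, lap):
        # x:   (num_vertices, in_ch) per-vertex features
        # lap: (num_vertices, num_vertices) rescaled Laplacian, eigenvalues in [-1, 1]
        tx_prev, tx = x, lap @ x                     # T_0(L)x, T_1(L)x
        out = tx_prev @ self.weight[0]
        if self.K > 1:
            out = out + tx @ self.weight[1]
        for k in range(2, self.K):
            tx_next = 2.0 * (lap @ tx) - tx_prev     # Chebyshev recurrence
            out = out + tx_next @ self.weight[k]
            tx_prev, tx = tx, tx_next
        return out + self.bias


# Hypothetical usage: filter per-vertex coordinates, then pool to a coarser mesh.
V = 50                                  # toy vertex count
x = torch.randn(V, 3)                   # per-vertex xyz
lap = torch.eye(V)                      # placeholder Laplacian, for illustration only
down = torch.rand(V // 2, V)            # placeholder down-sampling matrix
feat = torch.relu(ChebConv(3, 16, K=6)(x, lap))
coarse = down @ feat                    # coarser level of the mesh hierarchy
```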

    SCULPT: Shape-Conditioned Unpaired Learning of Pose-dependent Clothed and Textured Human Meshes

    Full text link
    We present SCULPT, a novel 3D generative model for clothed and textured 3D meshes of humans. Specifically, we devise a deep neural network that learns to represent the geometry and appearance distribution of clothed human bodies. Training such a model is challenging, as datasets of textured 3D meshes for humans are limited in size and accessibility. Our key observation is that there exist medium-sized 3D scan datasets like CAPE, as well as large-scale 2D image datasets of clothed humans, and that multiple appearances can be mapped to a single geometry. To effectively learn from the two data modalities, we propose an unpaired learning procedure for pose-dependent clothed and textured human meshes. Specifically, we learn a pose-dependent geometry space from 3D scan data. We represent this as per-vertex displacements w.r.t. the SMPL model. Next, we train a geometry-conditioned texture generator in an unsupervised way using the 2D image data. We use intermediate activations of the learned geometry model to condition our texture generator. To alleviate entanglement between pose and clothing type, and pose and clothing appearance, we condition both the texture and geometry generators with attribute labels such as clothing types for the geometry, and clothing colors for the texture generator. We automatically generate these conditioning labels for the 2D images based on the visual question answering model BLIP and CLIP. We validate our method on the SCULPT dataset, and compare to state-of-the-art 3D generative models for clothed human bodies. We will release the codebase for research purposes.
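    The two-stage conditioning described above (a geometry network predicting per-vertex displacements, whose intermediate activations condition a texture generator along with attribute labels) can be sketched roughly as below. This is not the SCULPT codebase; the layer sizes, label vocabularies, and the toy forward pass are assumptions for illustration.

```python
# Illustrative sketch of geometry-conditioned texture generation with attribute
# labels (assumed, e.g. derived from BLIP/CLIP). Not the authors' architecture.
import torch
import torch.nn as nn

NUM_VERTS = 6890          # SMPL vertex count
GEOM_LABELS = 4           # clothing-type labels for the geometry branch (assumed)
TEX_LABELS = 8            # clothing-color labels for the texture branch (assumed)


class GeometryGenerator(nn.Module):
    """Pose + clothing type -> per-vertex displacements, exposing a feature vector."""

    def __init__(self, pose_dim=72, feat_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(pose_dim + GEOM_LABELS, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
        )
        self.head = nn.Linear(feat_dim, NUM_VERTS * 3)

    def forward(self, pose, cloth_type):
        feat = self.backbone(torch.cat([pose, cloth_type], dim=-1))
        disp = self.head(feat).view(-1, NUM_VERTS, 3)   # displacements w.r.t. the template
        return disp, feat                               # feat conditions the texture branch


class TextureGenerator(nn.Module):
    """Geometry features + clothing-color label + noise -> a UV texture map."""

    def __init__(self, feat_dim=256, z_dim=64, tex_res=64):
        super().__init__()
        self.tex_res = tex_res
        self.net = nn.Sequential(
            nn.Linear(feat_dim + TEX_LABELS + z_dim, 512), nn.ReLU(),
            nn.Linear(512, 3 * tex_res * tex_res), nn.Sigmoid(),
        )

    def forward(self, geom_feat, color_label, z):
        tex = self.net(torch.cat([geom_feat, color_label, z], dim=-1))
        return tex.view(-1, 3, self.tex_res, self.tex_res)


# Toy forward pass: geometry learned from 3D scans, texture trained against 2D
# images (the adversarial discriminator is omitted in this sketch).
pose = torch.randn(2, 72)
cloth = torch.eye(GEOM_LABELS)[torch.tensor([0, 2])]
color = torch.eye(TEX_LABELS)[torch.tensor([1, 5])]
disp, feat = GeometryGenerator()(pose, cloth)
texture = TextureGenerator()(feat, color, torch.randn(2, 64))
```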

    Aligned Discriminative Pose Robust Descriptors for Face and Object Recognition

    No full text
    Face and object recognition in uncontrolled scenarios is a challenging research area due to pose and illumination variations, low resolution, etc. Here we propose a novel descriptor, the Aligned Discriminative Pose Robust (ADPR) descriptor, for matching faces and objects across pose, which is also robust to resolution and illumination variations. We generate virtual intermediate pose subspaces from training examples at a few poses and compute the alignment matrices of those subspaces with the frontal subspace. These matrices are then used to align the generated subspaces with the frontal one. An image is represented by a feature set obtained by projecting its low-level feature onto these aligned subspaces and applying a discriminative transform. Finally, concatenating all the features, we generate the ADPR descriptor. We perform experiments on face and object databases across pose, and pose and resolution, and compare with state-of-the-art methods, including deep learning approaches, to show the effectiveness of our descriptor.
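    A rough numpy sketch of the pipeline the abstract outlines (per-pose subspaces, alignment to the frontal subspace, projection, concatenation) is given below. The PCA subspaces and the least-squares alignment here are simplifying assumptions, not the authors' exact formulation, and the discriminative transform is omitted.

```python
# Illustrative ADPR-style descriptor construction under assumed simplifications.
import numpy as np


def pca_basis(X, dim):
    """Orthonormal basis (columns) of the top `dim` principal directions of X (rows = samples)."""
    Xc = X - X.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:dim].T                                   # (feat_dim, dim)


def align_to_frontal(basis, frontal_basis):
    """Alignment matrix A minimizing ||frontal_basis - basis @ A||_F, applied to the basis."""
    A, *_ = np.linalg.lstsq(basis, frontal_basis, rcond=None)
    return basis @ A                                    # aligned subspace


rng = np.random.default_rng(0)
feat_dim, sub_dim = 128, 10
poses = ["frontal", "left30", "right30"]                # a few training poses (illustrative)
data = {p: rng.normal(size=(60, feat_dim)) for p in poses}   # placeholder low-level features

bases = {p: pca_basis(X, sub_dim) for p, X in data.items()}
aligned = {p: align_to_frontal(B, bases["frontal"]) for p, B in bases.items()}

x = rng.normal(size=feat_dim)                           # low-level feature of a test image
descriptor = np.concatenate([aligned[p].T @ x for p in poses])
print(descriptor.shape)                                 # (len(poses) * sub_dim,)
```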

    GenLR-Net: Deep framework for very low resolution face and object recognition with generalization to unseen categories

    No full text
    Matching very low resolution images of faces and objects with high resolution images in the database has important applications in surveillance scenarios, street-to-shop matching for general objects, etc. Matching across a huge resolution difference, along with variations in illumination, viewpoint, etc., makes the problem quite challenging. The problem becomes even more difficult if the testing objects have not been seen during training. In this work, we propose a novel deep convolutional neural network architecture to address these problems. We systematically introduce different kinds of constraints at different stages of the architecture so that the approach can recognize low resolution images as well as generalize well to images of unseen categories. The reason behind each additional step, along with its effect on the overall performance, is thoroughly analyzed. Extensive experiments are conducted on two face and object datasets, which justify the effectiveness of the proposed approach for handling these real-life challenging scenarios.
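    As a hedged sketch of the general idea (a shared embedding constrained so that matching low-resolution and high-resolution images land close together), one could set up something like the following; the backbone, loss, and staging are assumptions for illustration, not the GenLR-Net design.

```python
# Illustrative low-/high-resolution matching with a contrastive constraint.
import torch
import torch.nn as nn
import torch.nn.functional as F


class EmbedNet(nn.Module):
    def __init__(self, out_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, out_dim)

    def forward(self, x):
        return F.normalize(self.fc(self.conv(x).flatten(1)), dim=-1)


def pair_loss(z_lr, z_hr, same, margin=0.5):
    """Contrastive constraint: matched LR/HR embeddings close, mismatched pushed apart."""
    d = (z_lr - z_hr).pow(2).sum(dim=-1)
    return (same * d + (1 - same) * F.relu(margin - d.sqrt()).pow(2)).mean()


net = EmbedNet()
hr = torch.randn(4, 3, 64, 64)                       # high-resolution gallery images
lr = F.interpolate(hr, size=8)                       # simulated very-low-resolution probes
lr_up = F.interpolate(lr, size=64)                   # upsample so both share one backbone
loss = pair_loss(net(lr_up), net(hr), same=torch.ones(4))
loss.backward()
```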

    Discriminative pose-free descriptors for face and object matching

    No full text
    Pose invariant matching is a very important problem with various applications, such as recognizing faces in uncontrolled scenarios in which the facial images appear in a wide variety of pose and illumination conditions along with low resolution. Here we propose two discriminative pose-free descriptors, the Subspace Point Representation (DPF-SPR) and the Layered Canonical Correlated (DPF-LCC) descriptor, for matching faces and objects across pose. Training examples at very few poses are used to generate virtual intermediate pose subspaces. An image is represented by a feature set obtained by projecting its low-level feature onto these subspaces, and a discriminative transform is applied to make this feature set suitable for recognition. We represent this discriminative feature set by two novel descriptors. In one approach, we transform it to a vector by using a subspace-to-point representation technique. In the second approach, a layered structure of canonical correlated subspaces is formed, onto which the feature set is projected. Experiments on recognizing faces and objects across pose, and comparisons with the state of the art, show the effectiveness of the proposed approach. (C) 2017 Elsevier Ltd. All rights reserved.
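    The "subspace to point" idea used by DPF-SPR, i.e. summarising the set of pose-subspace projections of one image by a single vector, can be sketched as below. The specific mapping (vectorising the projection matrix of the feature set) is a common subspace-to-point construction and an assumption here, not necessarily the exact one used in the paper.

```python
# Illustrative subspace-to-point descriptor from a set of projected features.
import numpy as np


def subspace_to_point(feature_set, dim):
    """Map a set of feature vectors (rows) to one vector via its projection matrix."""
    _, _, vt = np.linalg.svd(feature_set - feature_set.mean(0), full_matrices=False)
    basis = vt[:dim].T                       # orthonormal basis of the feature set
    proj = basis @ basis.T                   # projection matrix, independent of basis choice
    return proj[np.triu_indices_from(proj)]  # symmetric, so keep only the upper triangle


rng = np.random.default_rng(1)
num_subspaces, feat_dim = 5, 64
# One row per virtual pose subspace: the image feature projected onto that subspace
# and mapped back to the ambient space (projections assumed precomputed here).
projected_feats = rng.normal(size=(num_subspaces, feat_dim))
descriptor = subspace_to_point(projected_feats, dim=3)
print(descriptor.shape)                      # fixed-length vector, directly comparable
```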

    Discriminative Pose-Free Descriptors for Face and Object Matching

    No full text
    Pose invariant matching is a very important and challenging problem with various applications, such as recognizing faces in uncontrolled scenarios, matching objects taken from different viewpoints, etc. In this paper, we propose a discriminative pose-free descriptor (DPFD) which can be used to match faces/objects across pose variations. Training examples at very few representative poses are used to generate virtual intermediate pose subspaces. An image or image region is then represented by a feature set obtained by projecting it on all these subspaces, and a discriminative transform is applied on this feature set to make it suitable for classification tasks. Finally, this discriminative feature set is represented by a single feature vector, termed the DPFD. The DPFDs of images taken from different viewpoints can be directly compared for matching. Extensive experiments on recognizing faces across pose, and pose and resolution, on the Multi-PIE and Surveillance Cameras Face datasets, and comparisons with state-of-the-art approaches, show the effectiveness of the proposed approach. Experiments on matching general objects across viewpoints show the generalizability of the proposed approach beyond faces.
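    Once each image is reduced to a single descriptor vector, matching across viewpoints amounts to a direct comparison of descriptors, as the abstract notes. The toy example below illustrates that matching stage; the LDA step stands in for the paper's discriminative transform, and the random data and dimensions are purely illustrative.

```python
# Toy matching of single-vector descriptors across viewpoints.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
n_train, n_classes, desc_dim = 90, 9, 40
train_desc = rng.normal(size=(n_train, desc_dim))        # descriptors of training images
train_ids = np.repeat(np.arange(n_classes), n_train // n_classes)

# Discriminative transform (LDA here, as a stand-in) learned on training descriptors.
lda = LinearDiscriminantAnalysis(n_components=n_classes - 1).fit(train_desc, train_ids)
gallery = lda.transform(rng.normal(size=(n_classes, desc_dim)))   # one descriptor per identity
probe = lda.transform(rng.normal(size=(1, desc_dim)))             # descriptor of a query image

# Cosine nearest neighbour between the probe and gallery descriptors.
sims = (probe @ gallery.T) / (
    np.linalg.norm(probe) * np.linalg.norm(gallery, axis=1) + 1e-12)
print("matched identity:", int(sims.argmax()))
```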